Unsupervised Structure Discovery for Semantic Analysis of Audio
نویسندگان
چکیده
Approaches to audio classification and retrieval tasks largely rely on detectionbased discriminative models. We submit that such models make a simplistic assumption in mapping acoustics directly to semantics, whereas the actual process is likely more complex. We present a generative model that maps acoustics in a hierarchical manner to increasingly higher-level semantics. Our model has two layers with the first layer modeling generalized sound units with no clear semantic associations, while the second layer models local patterns over these sound units. We evaluate our model on a large-scale retrieval task from TRECVID 2011, and report significant improvements over standard baselines.
منابع مشابه
Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
متن کاملPresentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
متن کاملStructured Models for Semantic Analysis of Audio Content
In the universe of audio signals, the notions of syntax, semantics, pragmatics, etc. have been associated with a very limited set of domains, such as speech and language, and musical analysis, to some extent. However, research efforts focussing on formalizing general notions of syntactic or semantic structure for universal audio analysis have been relatively limited. Prior work in analysis of a...
متن کاملLayered Dynamic Mixture Model for Pattern Discovery in Asynchronous Multi-modal Streams
We propose a layered dynamic mixture model for asynchronous multi-modal fusion for unsupervised pattern discovery in video. The lower layer of the model uses generative temporal structures such as a hierarchical hidden Markov model to convert the audiovisual streams into mid-level labels, it also models the correlations in text with probabilistic latent semantic analysis. The upper layer fuses ...
متن کاملCluster Based Cross Layer Intelligent Service Discovery for Mobile Ad-Hoc Networks
The ability to discover services in Mobile Ad hoc Network (MANET) is a major prerequisite. Cluster basedcross layer intelligent service discovery for MANET (CBISD) is cluster based architecture, caching ofsemantic details of services and intelligent forwarding using network layer mechanisms. The cluster basedarchitecture using semantic knowledge provides scalability and accuracy. Also, the mini...
متن کامل